This preliminary study attempts to determine the most effective
strategies for health effects, for each of five selected Lockheed DIALOG
data bases (BIOSIS Previews, Chemical Abstracts Condensates, NTIS,
Enviroline, and Pollution Abstracts), as the concept is used by the EPA
library, Research Triangle Park, North Carolina, in relationship to
substances. The effectiveness of different strategies for specific data
bases is tested by determining recall and precision for an essential or
core strategy plus that of additional strategies and comparing what the
additional terms and/or codes did or did not add to the recall and
precision of the essential strategy.

Tentative strategies were developed, and summaries of areas of the
searches still requiring testing are included. Definite trends can be
established for each data base. Different strategies are required in
each, and levels of precision attainable vary with each data base. With
the three larger data bases (BIOSIS, Chemcon, and NTIS), strategies were
developed first, the searches run, results calculated, and strategies
synthesized from the results. With the two smaller data bases
(Enviroline and Pollution Abstracts), strategies were developed by
selecting possible search terms/codes from relevant citations, testing
hypothetical search strategies for recall and precision, and then
synthesizing more final strategies.

Headings:

Online searching — Strategies and profiles 

Lockheed DIALOG data bases
I. INTRODUCTION

One of the most important topics searched on computer data bases at
the Environmental Protection Agency library at Research Triangle Park,
North Carolina, is that of health effects of substances. This topic is
searched in conjunction with some specific chemical or pollutant or group
of the same. Health effects is a broad, amorphous, and interdisciplinary
category, varying in the aspects it covers and requiring numerous terms
to be entered into the search strategy (profile). In addition, it is
usually necessary to undertake the search in at least three, and even as
many as eleven or more, data bases because of the interdisciplinary
nature of the EPA's concerns. Therefore, it would be helpful if the
effectiveness of different strategies for specific data bases could be
determined in order to increase and assure accuracy, relevance, and
completeness (making certain important articles are not inadvertently
missed), and to cut down on the cost of a search where additional,
unnecessary terms can be excluded. The search results from such
strategy testing should show terms that are essential plus additional
terms that produce a pattern of increasing recall--possibly reaching
a plateau--until an optimum level of recall and relevance is reached,
after which point the relevance will start dropping. So, it should be
possible to determine which profiles produce the most desirable results.

Thus, the purpose of this study is to attempt to determine what
are the most effective strategies for health effects for each of five
selected Lockheed DIALOG data bases (BIOSIS Previews, Chemical Abstracts
Condensates, or Chemcon for short, NTIS, Enviroline, and Pollution
Abstracts) used at the EPA library in Research Triangle Park, North
Carolina. The results should be considered tentative since the
strategies are regarded as preliminary until they can be tested with
other pollutants and chemicals, especially those with health effects
differing from those of asbestos.







II. REVIEW OF RELATED LITERATURE



Articles reporting a library's experiences with computerized
literature searching are fairly common at present. Some deal primarily
with operational costs (Calkins, 1977), others with the library users
performing searches directly themselves or with training users (Shearer,
1975; Callaghan and Howden, 1972; Hines, 1975), while others give a
general overview which covers a range of the searching experience
(Prewitt, 1974; Schipma, 1974). Another common kind of article is the
comparative type, which frequently overlaps the experiential type
(Laurence, 1974). These vary in what is compared. Some concern
themselves more with comparing searching of different data bases
(Beauchamp, 1973), some with the usefulness of different systems for
retrieval (Verheijen-Voogd and Mathijsen, 1974; Prewitt, 1975), and some
do both (Weiss, 1976).

A number of online studies have been undertaken in EPA libraries.
Calkins compared operational costs of different online systems with
manual and batch searching (1977). Long and McCullough compared
retrieval of different data bases as they related to environmental
science search topics requested by researchers (1976; 1975).

Various combinations of topics appear in the numerous articles
that have been written on searching computerized data bases; in addition
to aspects already mentioned, some deal with in-house data bases, others
with commercial ones; some with online searching, others with batch. The
interest of this paper is a specific commercial online system, that is,
Lockheed's DIALOG, and with health effects profiling for five of its data
bases.



In the area of literature on search strategies and/or profiling,
much less has been written for either online or batch systems, and little
goes beyond the usual generalities of searching techniques that are
included as only a portion of an article on computerized literature
searching. (Literature searches on searching strategies/profiles on
BIOSIS, Chemcon 3 and 4, ERIC, and NTIS plus a survey of searching
articles indexed in the last five years in Library Literature and Library
and Information Science Abstracts were used to gain this assessment.)
Even less can be found when one considers what has been written
specifically about DIALOG and other systems that incorporate full-text
searching with more controlled techniques. Very few studies exist that
devote themselves only to the in-depth study of all the constituents
involved in profile or strategy development. This may well be attributed
to the cost involved in performing such studies. It cost approximately
$310 in connect-time and printing of citations offline to perform the
study presented in this paper. Several new publications devoted entirely
to online information systems are beginning publication this year, e.g.,
Online, and may fill this gap. (While some of the data base manuals
supply advice on profile development, and the manuals tend to be
improving in quality and usefulness, they still do not supply searchers
with much of the practical, individualized advice that is needed.)



While the "paucity of practical literature on the subject" of
profile construction has been reported by Butterly (1975), it is fairly
easy to find very general guidelines such as
    The major subtasks involved in framing a request include:
    1) selecting search terms; 2) augmenting the terms with
    instances, synonymous phrases or other related terms; 3)
    expressing the logical relationships which exist between terms;
    4) trying out aspects of the request or the whole request in
    order to discover how well it works; and, 5) explaining what
    to do with the retrieved records once the user is satisfied.
    It is generally agreed that a major advantage of interactive
    retrieval is that one can revise a request to conform to what
    one discovers about the data base. Some users are likely to
    carefully think out the request ahead of time and proceed one
    step to the next. Others are likely to skip right to the
    middle and add words to their request while they browse
    (Martin, 1975, p. 79).



or 



    Many factors can influence the success or failure of a
    search. Primary causes of failure are lack of appropriate
    terms in our controlled vocabulary (some terms are too general
    and some too specific), lack of specificity in indexing or
    omission of necessary terms, and search formulations which do
    not adequately cover the request. Other failures are caused
    by inadequate user/formulator interaction (Jenkins, 1972,
    p. 4^5).

Others, in describing their searching cycle, also include free-text,
iterative methods and the resultant revision of search strategies
(Prewitt, 1974, p. 117) or interactive programs for preparing strategies
as Schultz described for BIOSIS (1974, pp. 5-9). Sometimes review
articles supply searching guidelines for a wide range of searching
methods and/or systems (Stevens, 1974).

Some articles in an area closely tied to profiling are those dealing
with question negotiation in query formulation (Heim, 1975) and those
dealing with indexing, its quality (Farradane and Yates-Mercer, 1973), and
the effect of different methods on retrieval efficiency (Schipma, 1976).

Even those authors who provide more detailed guidance in search
profiling are quick to point out difficulties in providing such
directions. As Lynch says,
    Profile construction . . . is still largely a subjective
    and pragmatic process, depending to a considerable
    extent on skill and experience. For most users, it is
    intricate, time consuming and remote from their normal
    practices in consulting conventional sources (1974, p. 66).

Furthermore, according to Lancaster,

    It is usually difficult to make firm recommendations
    relating to search strategies. Nevertheless, we must
    carefully examine the failure analyses with a view to assembling
    a collection of pointers for searchers (1968, p. 157).

And, he does provide useful guidelines for both searching strategies
and the concomitant failure analyses in several chapters in
Information Retrieval Systems (1968, "Factors Affecting the Performance
of an Information Retrieval System," pp. 64-78; "Analysis of the Test
Data," pp. 130-150; "Interpretation and Application of the Test Data,"
pp. 151-159; and "Searching Strategies," pp. 198-2^T).
A summary of his guidelines is particularly germane to this study
and is as follows:

    1) A high level of exhaustivity of indexing makes for
    high recall and low precision. Conversely, a low level
    of exhaustivity of indexing makes for low recall and high
    precision (1968, p. 67).

    2) A highly specific index language will allow high
    precision capabilities in searching but will also tend to
    reduce recall performance. An index language of low
    specificity will tend to produce high recall figures but
    will not allow high precision performance (1968, p. 70).

    3) Exhaustivity of indexing and specificity of index language
    govern the recall and precision capabilities of an index.
    However, the searcher is able to vary recall and precision
    performance for a particular search by the adoption of
    various searching strategies (1968, p. 70).

    4) Given the ability to vary our search formulation (in
    order to retrieve more documents or fewer documents as the
    situation demands), by moving up or down hierarchies, by
    substituting synonyms, or by some other technique, we are
    able to carry out searches of varying degrees of generality.
    For any search, or group of searches, we can thus vary the
    position at which we choose to operate on a hypothetical
    performance curve. Thus we can decide between sacrificing
    precision and going all out for a high recall performance,
    or sacrificing recall to obtain a high precision search,
    or we can adopt a compromise and operate somewhere in
    between (1968, p. 71).

    5) Relevance standards of users of a retrospective
    searching system are obviously closely related to the
    generality of requests . . . [With] a very general request
    for the particular system being evoked [i]t should be
    possible to achieve both a high recall and a high precision
    figure for such a search since the requestor will accept
    any document that bears on the general subject . . . (1968,
    p. 78).

    6) System failures attributable to indexing . . . are
    [of] two distinct types . . .: (a) those due to indexer
    errors and (b) those due to a policy decision regarding
    the average number of terms assigned in indexing. Indexer
    errors are themselves two types: (a) omission of a term
    or terms necessary to describe an important topic
    discussed in an article, and (b) use of a term that appears
    inappropriate to the subject matter of the article.
    Omissions will normally lead to recall failures, while use of
    an inappropriate term (i.e., sheer misindexing) can cause
    either a precision failure (the searcher uses this term in
    a strategy and retrieves an irrelevant item) or a recall
    failure (the searcher uses the correct terms and a wanted
    document is missed because labeled with an incorrect term)
    (1968, p. 140).

In capsule form, the principal causes of information retrieval
systems failure from indexing are: 1) lack of specificity, lack of
exhaustivity, omission of important concepts, and use of inappropriate
terms, which lead to recall failures; and 2) exhaustive indexing and use
of inappropriate terms, which lead to precision failures. The principal
causes of failure from searching are: 1) failure to cover all reasonable
approaches to retrieval (e.g., not using one particular relevant term or
term combination), too exhaustive formulation, and too specific
formulation, which lead to recall failures; and 2) not sufficiently
exhaustive formulation, not sufficiently specific formulation, use of
inappropriate terms or term combinations, and defects in search logic,
which lead to precision failures (Lancaster, 1968, pp. 150, 143).

More specific guidelines and examples to profiling exist, such as
search term selection, different kinds of techniques in formulation, and
illustrative examples of why searches may fail (Lancaster, 1968, pp. 191-
207); term relationships and word distance requirements and examples of
search strategies (Lancaster, Rapport, and Penry, 1972, pp. 239-238); and
profile construction in controlled-vocabulary data bases, free-text data
bases, and interactive systems (Lynch, 1974, pp. 66-74). Scheffler
describes a study in using Boolean NOT logic for improving SDI profile
precision (1972), while Smith offers Venn diagramming as an aid in
profile development (1976).

In the area of access points in searching, Williams emphasizes that

    Many research projects have analyzed the utility of
    various access points--terms in titles, abstracts, extracts,
    digests, and controlled or uncontrolled index terms, key words
    and codes. The access points are evaluated with respect to
    recall, precision, and volume of material that must be checked
    by the user. One cannot generalize from such studies, because
    they are specific to certain data bases. The quality of
    indexing and abstracting varies among data bases, and the
    information content in titles varies among authors and among
    fields (1974, p. 233).

A few examples of access point studies are the Barker, Veal, and
Wyatt comparison of efficiency when searching titles only, titles-plus-
keywords, and titles-plus-abstracts in free-text chemical data bases
(1972); the study by Lancaster, Rapport, and Penry on EARS (an epilepsy
file) comparing searching on abstract (plus index terms) versus index
terms alone (1972); the Fisher and Elcheson comparison of the effects of
combining title words and index terms against using only either one of
these accesses on the Nuclear Science Abstracts file (1972); and the
Byrne evaluation of the relative effectiveness of searching titles,
abstracts, and subject headings for a COMPENDEX data base (1975). Even
though generalizations in this area can be risky, Byrne's assessment that
"there is a general agreement that the addition of abstracts and/or
other free-language words is beneficial with regard to recall" (1975,
p. 224) more often than not is accurate, but one must realize that this
addition can profoundly reduce precision. Roe, Micuda, and Seeds found
that in searching the natural-language data base CAIN (now AGRICOLA)

    . . . success with title-word searching appears to vary
    inversely with the vocabulary size of a subject. The more
    limited, precise, or universal the terminology defining a
    subject, the greater the rate of success. Thus searches
    involving scientific names or unique processes were most
    successful in retrieving high percentages of relevant
    citations without the nuisance of "false drops" (1975, p. 7^6).

Lancaster and Fayen discuss searching using different methods of
vocabulary control (1973, Chp. 11, "Vocabulary in the On-Line System,"
pp. 244-262).

Examples of articles providing information on searching techniques
specific to the DIALOG system are two which also compare DIALOG with
ORBIT. One of Weiss's concerns was searcher keystrokes. He also gave a
thorough discussion of system commands (1976). Prewitt compared
searching Chemical Abstracts Condensates on DIALOG and ORBIT and in the
process covered searchable fields, subject searching, truncation
features, searching techniques, and provided example searches (1975).
Both these articles necessarily provide mostly general information.

Durkin and Smith discuss methods for retrieving environmental-
sciences-related information from BIOSIS Previews including use of the
CROSS and Bio-Systematic Indexes and provide examples of search
strategies (1975, pp. 15-16). They do not discuss the general concept of
health effects, however. Nees and Green evaluated the BIOSIS data base,
its indexes, and several systems for searching it including DIALOG (1976).
They concluded that "[i]nitial review of the Subject Guide to CROSS
Index, CROSS Code, Biosystematic Code, and Guide to the Vocabulary of
Biological Literature is 'essential'" in strategy preparation (1976, p. 8).
In addition to providing an example of a DIALOG strategy (1976, p. 33),
they have a useful summary section including techniques and cautions for
BIOSIS on DIALOG ("a. Searches for Regular Clients," pp. 16-25). Much of
their observation and advice parallels that of searchers at the EPA
library in Research Triangle Park, such as the powerfulness of the CROSS
Code as a strategy tool (1976, pp. 21, 37) and the general necessity of
limiting to major level CROSS Codes in most searches to prevent
irrelevance and false drops (pp. 20-21).
III. METHODOLOGY

The description of this study and its results--thus the remaining
chapters of this paper--necessarily assume experience with searching and
are written accordingly.

The methodology in this study has two stages: 1) that involved in
the development of the search strategies being tested and 2) that
involved in the analysis of results from the strategies.

The methods used for developing the strategies are of two types.
One was used for the three larger data bases--BIOSIS, Chemcon, and NTIS--
the other for the two smaller data bases--Enviroline and Pollution
Abstracts. For all data bases, however, the health effects are those
caused by asbestos. The stored asbestos strategy F87 that was ANDed with
the other portions of each search appears in Figure 1. (The terms were
selected from Standen, 1967.) This helps limit the number of citations
to a manageable number and also determine relevance, as asbestos has a
wide range of health effects (Bogovski et al., 1973). Further control was
obtained by limiting BIOSIS, Chemcon, and NTIS to a period of one year--
those citations published in the 1976 volume(s) of the indexes;
Enviroline to two years--1975-1976; and Pollution Abstracts to five
years--1972-1976.

Fig. 1. Stored asbestos
strategy F87.

S ASBEST?
S SERPENTINE
S CHRYSOTILE
S AMPHIBOLE
S ANTHOPHYLLITE
S AMOSITE
S FERROANTHOPHYLLITE
S CROCIDOLITE
S TREMOLITE
S ACTINOLITE
C 1-10/OR
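
As an illustration only, the stored strategy in Figure 1 can be read as
ten SELECT (S) statements whose sets are then COMBINEd (C) with OR. The
Python sketch below mimics that logic on a toy inverted index. The index
contents, the function names select and combine_or, and the treatment of
a trailing "?" as simple prefix truncation are assumptions made for the
example; they are not a description of how DIALOG works internally.

```python
# Toy inverted index: term -> set of citation numbers (all data hypothetical).
INDEX = {
    "ASBESTOS": {1, 2, 3},
    "ASBESTOSIS": {3, 4},
    "CHRYSOTILE": {2, 5},
    "CROCIDOLITE": {6},
    "TALC": {7},
}

# The ten terms of stored strategy F87, as listed in Figure 1.
F87_TERMS = [
    "ASBEST?", "SERPENTINE", "CHRYSOTILE", "AMPHIBOLE", "ANTHOPHYLLITE",
    "AMOSITE", "FERROANTHOPHYLLITE", "CROCIDOLITE", "TREMOLITE", "ACTINOLITE",
]


def select(term, index):
    """Return the citation set for one term; a trailing '?' means truncation."""
    if term.endswith("?"):
        stem = term[:-1]
        hits = set()
        for word, postings in index.items():
            if word.startswith(stem):
                hits |= postings
        return hits
    return set(index.get(term, ()))


def combine_or(sets):
    """Union of all the given sets, in the spirit of 'C 1-10/OR'."""
    result = set()
    for s in sets:
        result |= s
    return result


asbestos_set = combine_or(select(term, INDEX) for term in F87_TERMS)
print(sorted(asbestos_set))  # [1, 2, 3, 4, 5, 6]
```

Terms with no postings in the toy index (SERPENTINE, AMOSITE, and so on)
simply contribute empty sets, so the union is unaffected by them.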


Methodology for Search Strategy Development--BIOSIS, Chemcon, and NTIS
Data and citations for these first three data bases were obtained
by running one search on each. Each search was divided into several
portions. In some cases the search had to be run in several steps because
of storage overloading problems from the large sizes of some sets. The
first portion contained terms and/or codes considered essential, ANDed
with the limited (i.e., to one year) asbestos strategy. This appraisal
was based both on strategies/profiles developed by the head librarian
during several years of experience searching health effects and the
author's own experience with searching. The rest of each search
contains additional sections of terms and/or codes to increase recall.
These were formulated from other strategies developed by the searchers at
the library, thesauri and search guides for the data bases, and free-text
words. Each section was ANDed with the limited asbestos strategy. Then
the NOT function was used to determine what the additional terms/codes
did or did not add to the recall and relevance compared with the
essential strategy. See Figures 2 and 6 for the strategies on BIOSIS,
Chemcon, and NTIS. The profiles were not exhaustive because the expense
would be prohibitive. However, manual methods were used to evaluate the
results in more depth than could be afforded in online evaluation.
Further discussion of the manual methods appears in "Methodology for
Analysis of Results" and in "Data Analysis".
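
The AND/NOT procedure just described amounts to simple set arithmetic.
The following is a minimal sketch, assuming each search portion can be
represented as a set of citation numbers; the function name
split_portions and all sample data are hypothetical and are not taken
from the searches actually run for this study.

```python
def split_portions(essential, additional, asbestos):
    """Return (core, added).

    core  -- the essential portion ANDed with the asbestos set
    added -- what an additional portion retrieves beyond that core,
             i.e. (additional AND asbestos) NOT core
    """
    core = essential & asbestos      # essential AND asbestos
    extra = additional & asbestos    # additional AND asbestos
    added = extra - core             # NOT: drop what the core already found
    return core, added


# Hypothetical citation numbers.
asbestos = set(range(1, 11))         # the limited asbestos strategy
essential = {1, 2, 3, 4, 20}         # essential terms/codes
additional = {3, 4, 5, 6, 30}        # one additional section

core, added = split_portions(essential, additional, asbestos)
print(sorted(core))   # [1, 2, 3, 4]
print(sorted(added))  # [5, 6]
```

Only the citations in added need further manual review to see whether an
additional section contributed relevant items or merely irrelevant ones.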
The important health effects that must be covered in a health
effects strategy are toxicological, carcinogenic, mortal, and other
pathological effects such as mutagenesis and teratogenesis. Both free-
text and indexer-supplied terms and codes were necessarily used to
achieve this end.

The primary focus of health effects is on humans, mammals, and
mammalian experimental animals; therefore, fish and plants were excluded
where such an exclusion was incorporated into an index code. And, since
health effects are the focus, profiles did not limit their focus to
specific organs or systems (e.g., cardiovascular diseases). It should be
apparent that by using these general strategies, these "specifics"
would be picked up.

The expected optimum precision was that 70% of the citations would
be relevant (not necessarily useful). This figure is the approximate
percentage of relevance per data base found by Long in her master's paper
(1976, p. 38) and is the level desired by the head librarian.

Methodology for Search Strategy Development--
Enviroline and Pollution Abstracts

The approach used for these two smaller data bases was to obtain
online all citations related to asbestos for the years given earlier--
attempting to achieve a number of citations as close to 100 as possible--
and work backwards in developing a health effects strategy or strategies.
This method was chosen because 1) the very small numbers of citations
that would be found on health effects of asbestos would make
evaluation and calculation of relevance and precision shaky and 2) the
specificity of the indexing would make it necessary to enter an
exhaustive number of terms for even an "essential core" strategy, at
considerable expense while producing little information on general terms
to use.
By working backwards, the citations relevant to health effects
could be assigned manually as with the method used for BIOSIS, Chemcon,
and NTIS described later in "Methodology for the Analysis of Results";
then possible free-text and controlled terms could be identified among
these and counted for frequencies; and finally, tentative strategies
developed and tested for their recall and precision.
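
The working-backwards procedure can be sketched as follows. This is an
illustrative example only: the citation records, the index terms, and
the helper names term_frequencies and evaluate_strategy are invented for
the sketch, and recall here is computed against the manually judged pool
of asbestos citations rather than against the whole file.

```python
from collections import Counter

# Hypothetical pool of asbestos citations, each judged manually.
CITATIONS = {
    1: {"terms": {"CANCER", "LUNG"}, "relevant": True},
    2: {"terms": {"TOXICITY", "RATS"}, "relevant": True},
    3: {"terms": {"CANCER", "MORTALITY"}, "relevant": True},
    4: {"terms": {"BRAKE LININGS", "EMISSIONS"}, "relevant": False},
    5: {"terms": {"CANCER", "LEGISLATION"}, "relevant": False},
}


def term_frequencies(citations):
    """Count how often each term occurs among the relevant citations."""
    counts = Counter()
    for record in citations.values():
        if record["relevant"]:
            counts.update(record["terms"])
    return counts


def evaluate_strategy(terms, citations):
    """Return (recall, precision) for an OR of the given terms."""
    retrieved = {cid for cid, rec in citations.items() if rec["terms"] & terms}
    relevant = {cid for cid, rec in citations.items() if rec["relevant"]}
    hits = retrieved & relevant
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


print(term_frequencies(CITATIONS).most_common(1))        # [('CANCER', 2)]
print(evaluate_strategy({"CANCER", "TOXICITY"}, CITATIONS))  # (1.0, 0.75)
```

In this toy pool, adding TOXICITY to CANCER raises recall from two of the
three relevant citations to all three while precision rises from 2/3 to
3/4; with real data the trade-off would have to be read off case by case.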

Methodology for Analysis of Results

The key measurement devices in the analysis of this study are
precision, recall, and relevance. Precision is the ratio or percentage of
relevant answers retrieved compared with the total number of references
retrieved (Saracevic, 1975, p. 327; Verheijen-Voogd and Mathijsen, 1974,
p. 141; Lancaster, 1968, p. 56). "In addition to the number of relevant
references retrieved, their precision is considered to be a criterion for
the effectiveness of a database" (Verheijen-Voogd and Mathijsen, 1974,
p. 141) or, for the purposes of this study, a search term or strategy or
portion thereof. Although Saracevic defines recall as "the ratio of
relevant answers retrieved over the total number of relevant answers in
the file" (1975, p. 327; see also Lancaster, 1968, p. 55), it will be
used in this study to designate the number of citations retrieved whether
relevant or not. It would be impossible, given the resources of money
and time available for this study, to determine the actual number of
relevant answers in files as large as the ones being studied. By
comparing the different numbers of citations recalled with different
profiles/terms and coupling this comparison with the years of experience
in searching health effects of the searchers at the library, a fairly
reasonable and reliable appraisal of what basic or core strategy gives
substantial, if not 100%, recall of relevant citations can be obtained.
And as Lancaster points out,

    When we consider that these ratios are merely tools by
    which we measure variations in performance within our own
    system, and within the confines of a controlled experiment, it is
    evident that any method that will give us reasonably accurate
    estimates of recall and precision is adequate, as long as we
    hold the method constant throughout the evaluation program.
    Even if the method results in slightly inflated, or slightly
    deflated, estimates of recall or precision, since the method
    is held constant it will still result in performance figures
    that will be valid tools to use in the comparison of system
    alterations (1968, p. 131).

Furthermore, both precision and recall must be used together in
order to get an accurate picture of what occurs because of their
inverse relationship, i.e., the more precise a search, the lower the
recall and vice versa (Lancaster, 1968, pp. 56, 58-59).
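
The measures as used in this study can be written out in a few lines.
The sketch below is an illustration under stated assumptions: the
function names and the sample counts are hypothetical, and the 0.70
target is the expected optimum precision given earlier in this chapter.
"Recall" follows the working definition adopted here (the raw number of
citations retrieved); the conventional ratio is included only for
contrast, since it needs a figure that could not be determined for files
of this size.

```python
TARGET_PRECISION = 0.70  # expected optimum precision stated earlier


def precision(relevant_retrieved, total_retrieved):
    """Ratio of relevant citations retrieved to all citations retrieved."""
    if total_retrieved == 0:
        raise ValueError("no citations retrieved")
    return relevant_retrieved / total_retrieved


def recall_as_used_here(total_retrieved):
    """Working definition in this study: the count of citations retrieved."""
    return total_retrieved


def conventional_recall(relevant_retrieved, relevant_in_file):
    """Conventional ratio; relevant_in_file is usually unknown in practice."""
    if relevant_in_file == 0:
        raise ValueError("no relevant citations in file")
    return relevant_retrieved / relevant_in_file


def meets_target(relevant_retrieved, total_retrieved, target=TARGET_PRECISION):
    """True when a strategy reaches the desired precision level."""
    return precision(relevant_retrieved, total_retrieved) >= target


# Hypothetical counts for two strategies retrieving 40 citations each.
print(precision(30, 40))     # 0.75
print(meets_target(30, 40))  # True
print(meets_target(20, 40))  # False
```

Holding these definitions constant across all strategies is what makes
the resulting figures comparable, in line with the passage quoted from
Lancaster above.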

At the heart of this study to determine effectiveness of search
profiles is the concept of relevance. A source of longstanding,
continuing discussion, relevance has been considered in great depth
(for two reviews on the subject see Rees and Saracevic, 1966 and
Saracevic, 1975). One factor seems readily agreed upon, that relevance
or similar evaluative judgments have definite subjective elements (cf.
Swanson and Meyer, 1975, p. 143; Fugmann, 1973, p. 359; Saracevic, 1975,
pp. 340, 341, 342; and Rees and Saracevic, 1966, pp. 9, 16). However,
agreement also exists that relevance can be judged (Lancaster, 1968,
pp. 120-121 and Rees and Saracevic, 1966, pp. 6, 10). "Although it may
appear that relevance judgment is a very subjective human process, it has
associated with it some remarkable regularity patterns" (Saracevic, 1975,
p. 342).

Since relevance can be judged, we are led to the issue of who will
judge and how the judgment will be made. This almost immediately raises
the question of how relevance and pertinence differ. As Rees and
Saracevic have pointed out,

    . . . a sharp distinction can be made between relevance to a
    question and relevance to the need underlying a question
    (documents satisfying the need are referred to by some authors
    as "pertinent", and the ones answering the question itself
    are "relevant": i.e., the real measure desired derives from
    the relation to the satisfaction of the information need of
    the user) (1966, pp. 8-9).

And, since

    Relevance is the property which assigns certain
    members of a file (e.g., documents) to the question; pertinence
    is the property which assigns them to the information need [,]
    . . . some relevant answers are also pertinent; but there
    could be relevant answers that are not pertinent and
    pertinent answers that are not relevant. It has often been argued
    that, from the user's point of view, desirable answers are
    pertinent answers; but, in reality, an IR [information
    retrieval] system can only provide relevant answers. That is,
    a system can only answer questions. It can only guess what
    the information need is. In practice, there is often a real
    tug of war in trying to satisfy information needs and not
    just answer questions (Saracevic, 1975, p. 332).

Because pertinence is "the subjective assessment by a user against
his own information needs" and "is confined to those aspects of an
individual's situation that are of concern to him; and may change over
time", its measurement tends to be very specific and individualized,
whereas "relevance is capable of public assessment and can, therefore,
only be assessed against a user's statement of his need" (Butterly, 1975,
p. 190).

Since health effects as a concept used in relationship to a chemical/
pollutant is a broad category and is requested as a search topic by a
wide variety of people, from EPA researchers to administrators to state
and local governmental agencies to private firms with government
contracts, it is intended to give a wide view of a substance's generally
accepted and potential health effects (usually negative but sometimes
beneficial), encompassing the entire animal system. This, coupled with
1) the tendency of differences in intended use of documents to produce
differences in relevance judgments, "suggesting that intended use becomes
part of the query" (Saracevic, 1975, pp. 341-342), and these different
user groups would have different intended uses; 2) that individuals'
preferences, purposes, and needs change, leading to rejection of
citations or similar citations that once satisfied or the reverse
situation (Swanson and Meyer, 1975, p. 142); and 3) that the user is
inclined to judge search responses to a request "with respect to his
subjective, a priori undefinable information need . . . and not with
respect to the objective, definable, and well considered search
requirements" mainly because of the added time and concentration it
would require to learn and conduct such analyses, "for this would divert
them too much from their discipline-oriented activities" (Fugmann, 1973,
pp. 361-362), would produce pertinence judgments and not the desired
relevance judgments.

This plus the following factors led to the decision that the author,
with the advice and assistance of the head librarian and searcher,
should make the relevance judgments:

1) The difficulty in getting a representative sample of judges from
the user population, not only because of the searches supplied to
non-EPA people but also because of the complexity of EPA
organization.

2) Because of the author's and the head librarian's subject
experience, by virtue of the area of searches performed and knowledge of
the scope of health effects, they are better qualified to make this
kind of relevance judgment than a panel of judges or an individual
judge. Extensive, specialized subject knowledge is not necessary
and would probably be a handicap.



3) Since there is no ranking of relevance order or evaluating
whether partially or totally relevant, the relevance decisions
are easier to make. The citations are judged either relevant or
not. Nor is the quality of citations being evaluated, because
the primary purpose of this study is to determine the effectiveness
of different search strategies in retrieving relevant citations.

Each citation was judged relevant if it dealt with any health
effect, whether negative or beneficial, although the emphasis is on
negative effects, and whether of primary or secondary importance in the
citation document. Excluded were documents that dealt only with diagnostic
methods or treatment and did not discuss the actual health effects of
asbestos.

All citations printed from each data base were displayed in the
fullest format available for the data base to enhance relevance
evaluation. Also, experimentation has shown that "the more complete a record,
the more likely is its selection as a hit" (Schipma, 1976, p. 5).

The author first evaluated the citations for relevance, then the
head librarian evaluated any citations whose relevance was in question.
The author was responsible for making all final relevance judgments.
Relevance judgments were then checked to ensure that the same relevance
designation had been given to the same citation regardless of what
subsection of a search or in what data base it appeared.

After relevancy judgments were assigned, the relevant citations for
each section were counted and the precision percentage calculated for the
BIOSIS, Chemcon, and NTIS searches. In the Enviroline and Pollution
Abstracts approach, the precision was calculated several times as a guide
for profile development. More indepth analyses by manual methods
followed and are discussed in the next chapter on data analysis. These
methods were used to arrive at the final strategy(ies).
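
The precision percentage used throughout is simply the number of relevant citations divided by the number of citations retrieved. As a minimal sketch (the function name `precision` and its argument checks are illustrative assumptions, not part of the study's procedure, which was carried out by hand), it can be written as:

```python
def precision(relevant: int, retrieved: int) -> float:
    """Return precision as a percentage: relevant citations / citations retrieved."""
    if retrieved <= 0:
        raise ValueError("retrieved must be positive")
    if not 0 <= relevant <= retrieved:
        raise ValueError("relevant must be between 0 and retrieved")
    return 100.0 * relevant / retrieved


# Counts reported for the BIOSIS core strategy: 66 relevant of 97 retrieved.
print(f"{precision(66, 97):.1f}%")
```

The same function applies to any set in the composites, since each set reports its citation count and its relevant count.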
IV. DATA ANALYSIS

BIOSIS (See Figure 2 for strategy and data.)

In the BIOSIS strategies all codes selected are limited to major
(primary and secondary) indexing to cut down on irrelevance. This
designation of "major", when referring to codes in the text, is denoted by a
preceding asterisk, e.g., *22506.

The essential strategy, Section 1, uses some toxicology codes, the
carcinogen codes, and those for teratology. As can be seen in the composite,
it exhibits high recall and acceptable precision of 68% (set 10). However,
the asbestos strategy F37 creates irrelevancies in this file because the term
serpentine (see Figure 1) sometimes refers to this soil type instead of to
asbestos. This problem could be corrected by NOTing the following strategy
against the final set in the asbestos serial:

    1 SERPENTINE SOIL
    2 SERPENTINE (3W) SOILS
    3 SOIL (3W) SERPENTINE
    4 SOILS (3W) SERPENTINE
    C 1-4/OR

It would not, in practice, be worth the additional cost since relatively few
citations are affected. If this correction is made by excluding irrelevant
serpentine soil citations in the counts, the precision increases to 70%.
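
The Boolean steps described above (AND the asbestos set with the health effects codes, then NOT the serpentine soil set) behave like ordinary set operations. The sketch below illustrates this with Python sets; the accession numbers are invented for illustration and the variable names are assumptions, not DIALOG syntax.

```python
# Hypothetical accession numbers; the real sets came from the DIALOG files.
asbestos = {101, 102, 103, 104, 105, 106}      # stored asbestos strategy
health_codes = {102, 103, 104, 105, 900, 901}  # OR of the health effects codes
serpentine_soil = {104, 950}                   # OR of the serpentine soil statements

core = asbestos & health_codes      # DIALOG: asbestos AND health effects codes
corrected = core - serpentine_soil  # DIALOG: core NOT serpentine soil

print(sorted(core))
print(sorted(corrected))
```

Only citations that are in both the core set and the serpentine soil set are dropped, which is why the correction changes the counts so little.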

Section 2, the first nonessential strategy, tests pathology codes.
All the citations overlap those of Section 1. Set 18 contains 0 citations.
The precision level is good, 83% (set 17).



Fig. 2.

COMPOSITE OF BIOSIS SEARCH STRATEGY*

Set   Citations    Search statement   Description

  1        633     SERIAL# F37        (Asbestos stored strategy)
  2   [illegible]  1/[illegible]      (1976 accession numbers)

Section 1
  3      24776     CC=22506           (Toxicology--Environmental)
  4       3079     CC=22508           (  --Veterinary)
  5      32282     CC=24007           (Neoplasms and Neoplastic Agents--
                                       Carcinogens and Carcinogenesis)
  6      21030     CC=25552           (Teratology and Teratogenesis--Descriptive)
  7   [illegible]  CC=25554           (  --Experimental)
  8      80090     3-7/OR
  9      60393     8/MAJ
 10         97     2 AND 9            (94 without Serpentine Soil)
                   66 Relevant (66)   68% Precision (70%)

Section 2
 11      26737     CC=12502           (Pathology, General and Misc.--General)
 12   [illegible]  CC=12503           (  --Comparative)
 13      48093     CC=12504           (  --Diagnostic)
 14      13249     CC=38004           (Veterinary Science--Pathology)
 15     148413     11-14/OR
 16      73552     15/MAJ
 17         12     2 AND 16           10 Relevant   83% Precision
 18          0     17 NOT 10

Section 3
 19      65295     CC=22501           (Toxicology--General, Methods, and Experimental)
 20   [illegible]  CC=37013           (Environmental Health--Occupational Health)
 21      26607     CC=37015           (  --Air, Water, Soil Pollution)
 22       2958     CC=3701[illegible] (  --Miscellaneous)
 23      97402     19-22/OR
 24      73145     23/MAJ
 25       4086     Mutag?             (Mutagen(s), Mutagenic, Mutagenesis, etc.)
 26       4669     Mutat?             (Mutate(s), Mutation(s), Mutating, etc.)
 27       1869     Teratog?           (Teratogen(s), Teratogenesis, etc.)
 28        401     Teratol?           (Teratology, Teratological, etc.)
 29      17119     Carcinogen?        (Carcinogen(s), Carcinogenic, Carcinogenesis, etc.)
 30      13547     Cancer?            (Cancer, Cancers, Cancerous, etc.)
 31      24733     Tumor?             (Tumor, Tumors, etc.)
 32      13531     Carcinoma?         (Carcinoma(s), etc.)
 33       1507     Neoplasm?          (Neoplasm(s), etc.)
 34      12777     SERIAL# CVI        (Mortality words stored strategy)
 35      79044     25-34/OR
 36     140643     35 OR 24
 37   [illegible]  2 AND 36           [counts and precision illegible]
 38          7     37 NOT 10          2 Relevant   28% Precision

Section 4
 39      22097     CC=12002           (Physiology, General and Misc.--General)
 40   [illegible]  CC=12003           (  --Comparative)
 41      43231     CC=22504           (Toxicology--Pharmacological)
 42   [illegible]  39-41/OR
 43      55296     42/MAJ
 44          5     2 AND 43           2 Relevant   40% Precision
 45          3     44 NOT 10          0 Relevant    0% Precision

Section 5
 46   [illegible]  CC=03506           (Genetics and Cytogenetics--Animal)
 47      39531     CC=03508           (  --Human)
 48   [illegible]  CC=13002           (Metabolism--General Metabolism; Metabolic Pathways)
 49      82535     CC=13003           (  --Energy and Respiratory Metabolism)
 50      42033     CC=13020           (  --Metabolic Disorders)
 51   [illegible]  46-50/OR
 52   [illegible]  51/MAJ
 53          7     2 AND 52           7 Relevant   100% Precision
 54          0     53 NOT 10

Section 6
 55   [illegible]  CC=[illegible]     (Immunology and Immunochemistry--General; Methods)
 56   [illegible]  CC=[illegible]     (  --Immunohematology (includes Bld. Grps.))
 57   [illegible]  CC=34508           (  --Immunopathology (Tissue Immun.))
 58   [illegible]  55-57/OR
 59   [illegible]  58/MAJ
 60          6     2 AND 59           [illegible] Relevant   67% Precision
 61          3     60 NOT 10          [illegible] Relevant   [illegible] Precision

*Run in separate sections because of a storage overload problem [remainder illegible].



Section 3 tested three different things: 1) codes for general toxicology
and environmental health; 2) free-text words for some concepts that
are covered by CROSS codes (e.g., mutagenesis, carcinogenesis, etc.); and
3) the free-text words for mortality in Figure 3 that do not have specific
CROSS code counterparts. While this section as a whole produced several
more citations than Section 1 (cf. sets 10 and 37), all but seven
overlapped those in Section 1 (set 38). Also Section 3's overall precision
was lower. The precision value of the seven unique citations was only 28%.

Fig. 3. Stored mortality words strategy CVI.

    S MORTALIT?
    S DEATH
    S DEATHS
    S FATAL?
    S AUTOP?
    S LETHAL
    C 1-6/OR

Section 4 tested general physiology codes and pharmacological
toxicology. It produced few citations (set 44), only 3 not included in the core
strategy (set 45), and none were relevant.

Section 5 tested codes for genetics (i.e., mutagenesis) and metabolism
in general. It produced no citations not included in the core (set 54)
even though the citations it retrieved were all relevant (set 53). It is
important to remember, however, that asbestos is not considered a mutagen
per se and therefore would not and did not retrieve enough citations in this
area to make an evaluation.

Section 6 tested codes for immunology. Six citations were produced
at 67% precision (set 60), but only 3 were not included in the core, of
which 1 was relevant (set 61) for a precision of 33%.

This so far has been a superficial and strictly numerical analysis
of results. Manual delving into the individual sections produces a much
more comprehensive and accurate picture of the strategy's effectiveness.

In Section 1 *22508 did not retrieve any citations. Veterinary
toxicology was originally entered to see if it picked up relevant
articles that discussed health effects of animals. This may not be significant.
*25554 and *25552 did not appear and both are teratology codes, *25552
appearing in its only appearance together with *22506 and *24007. This,
however, reflects the failure of asbestos to produce such effects and not
the utility of the teratology codes in health effects search profiles.
This will have to be tested later with other chemicals/pollutants as
should *22508. Experience has shown these codes to be useful for many
chemicals/pollutants.

This leaves *22506 and *24007 solely responsible for retrieval of
all Section 1 citations, relevant and nonrelevant. Table 1 gives the
actual numerical breakdown. As can be seen, both must be used since each
appears as the only index term from the core strategy in some of the
citations.

Table 1. Numerical breakdown of code appearance in Section 1

Section 1               Total    *22506   Alone    *24007   Alone    Together
                        Cits.
Relevant Citations        66       64       22       44       2         42
Nonrelevant Citations     31       30       26        4       0          4

In checking the resultant citations from Section 2, which overlapped
completely those of Section 1, *22506 would have retrieved them all
and only *22506 and *24007 of the Section 1 strategy appear in these
citations. The actual numerical breakdown appears in Table 2.



Table 2. Numerical breakdown of Section 1 code appearance in Section 2

Section 2               Total    *22506   Alone    *24007   Alone    Together
                        Cits.
Relevant Citations        10       10        8        2       0          2
Nonrelevant Citations      2        2        2        0       0          0
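
The "alone" and "together" columns in Tables 1 and 2 come from checking, citation by citation, which core-strategy codes were assigned. A hedged sketch of that tally follows; the function name `breakdown`, the result keys, and the four sample citations are all invented for illustration, since the study did this count manually.

```python
from collections import Counter


def breakdown(citations, code_a, code_b):
    """Count how often two codes index a citation alone or together.

    `citations` is an iterable of sets, each holding the core-strategy
    codes found in one citation.
    """
    counts = Counter()
    for codes in citations:
        has_a, has_b = code_a in codes, code_b in codes
        if has_a and has_b:
            counts["together"] += 1
        elif has_a:
            counts["a_alone"] += 1
        elif has_b:
            counts["b_alone"] += 1
        else:
            counts["neither"] += 1
    return counts


# Illustrative data only: four made-up citations.
sample = [{"22506"}, {"22506", "24007"}, {"24007"}, {"22506"}]
result = breakdown(sample, "22506", "24007")
print(dict(result))
```

A code is needed in the final strategy whenever its "alone" count among relevant citations is greater than zero, which is the argument made for keeping both *22506 and *24007.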



The citations unique to the Section 3 strategy did not contain any
of that section's keywords with the exception of one citation which was
nonrelevant. The breakdown for the CROSS code appearances is in Table 3.

Table 3. Numerical breakdown of code appearance in Section 3

Section 3               Total    *22501   Alone    *37013   Alone    Together
                        Cits.
Relevant Citations         2        1        1        1       1          0
Nonrelevant Citations      5        3        3        0       0          0

*37015 appeared once but only in a nonrelevant citation. *22501 and
*37013 together would have retrieved all relevant, unique section citations.

Since neither Section 4 nor Section 5 retrieved relevant citations
different from those in Section 1, their codes need not be used as the core
strategy picks up all of those retrieved. However, the genetics and
cytogenetics codes *03506 and *03508 still need to be compared against the
mutagenesis words with other substances to test the relative retrieval
effectiveness of these two approaches. This is also true of the teratology
codes in Section 1 versus the words in Section 3. The mortality words need to be
checked against other chemicals/pollutants, especially those more immediately
fatal than the slow-acting asbestos, to test their utility.

Section 6 retrieved two relevant citations unique from Section 1.
All three of the unique citations contained only *34508 from this section's
strategy. However, both these relevant citations contain the two CROSS
codes that retrieved all the Section 3 unique citations. The breakdown
appears in Table 4. Since both of these relevant citations were included
in the unique citations of Section 3, the Section 6 codes need not be used
in the strategy.

Table 4. Numerical breakdown of Section 3 code appearance in Section 6

Section 6               Total    *22501   Alone    *37013   Alone    Together
                        Cits.
Relevant Citations         2        1        1        1       1          0
Nonrelevant Citations      1        0        0        0       0          0



To arrive at the final strategy, then, only Sections 1 and 3 need to
be considered for asbestos health effects. This is also probably true for
health effects of other chemicals/pollutants. By combining the useful
codes in these two sections, the following asbestos health effects strategy
develops:

    CC=22506   (Toxicology--Environmental)
    CC=24007   (Neoplasms/Neoplastic Agents--Carcinogens/Carcinogenesis)
    CC=22501   (Toxicology--General)
    CC=37013   (Environmental Health--Occupational Health)
    1-4/OR
    5/MAJ

The results would be as follows:

     97 (94 without Serpentine Soil)    Citations from Section 1 (Set 10)
      7                                 Citations from Section 3 unique
                                        from Section 1 (Set 38)
    104 (101 without Serpentine Soil)   Total Citations

     66   Relevant Citations from Section 1
      2   Relevant Citations from Section 3 (Set 38) unique from Section 1
     68   Total Relevant Citations

     68/104 x 100 = 65% Precision or (correcting for Serpentine Soil)
     68/101 x 100 = 67%
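
The arithmetic above follows one pattern: take the core set, add only the citations a nonessential section contributes beyond it, and recompute precision on the union. The sketch below reproduces that pattern. The function name `evaluate_addition` and the synthetic citation identifiers are assumptions; the identifiers are merely sized to match the reported BIOSIS counts (97 core citations with 66 relevant, plus 7 unique citations with 2 relevant).

```python
def evaluate_addition(core, extra, relevant):
    """Summarize what an extra section adds to a core strategy.

    All three arguments are sets of citation identifiers.
    """
    combined = core | extra
    if not combined:
        raise ValueError("no citations retrieved")
    unique = extra - core
    return {
        "unique": len(unique),
        "unique_relevant": len(unique & relevant),
        "combined": len(combined),
        "combined_relevant": len(combined & relevant),
        "combined_precision": 100.0 * len(combined & relevant) / len(combined),
    }


# Synthetic identifiers sized to match the reported counts.
core = set(range(97))                  # Section 1: 97 citations
relevant = set(range(66)) | {97, 98}   # 66 relevant in core, 2 outside it
extra = set(range(40, 104))            # overlaps core, adds 7 unique citations
stats = evaluate_addition(core, extra, relevant)
print(stats)
```

Under these assumptions the combined precision comes out at about 65%, matching the uncorrected figure given above.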



Fig. 5.

COMPOSITE OF CA CONDENSATES SEARCH STRATEGY*

Set   Citations   Search statement      Description

  1      2254     SERIAL# F37           (Asbestos stored strategy)
  2       539     1/84000001-85999999   (1976 accession numbers)

Section 1
  3     38745     SC=CA004              (Toxicology)
  4     17384     TOXIC?                (Toxic, Toxicity, Toxicology, etc.)
  5      2467     HEALTH/TI,DE
  6      1453     HYGIEN?               (Hygiene, Hygienic, etc.)
  7     43827     3-6/OR
  8        46     2 AND 7               34 Relevant   74% Precision

Section 2
  9      3914     CARCINOGEN?           (Carcinogen(s), Carcinogenic, Carcinogenesis, etc.)
 10     10138     CANCER?               (Cancer(s), Cancerous, etc.)
 11      8328     TUMOR?                (Tumor(s), etc.)
 12      1210     CARCINOMA?            (Carcinoma(s), etc.)
 13      3801     NEOPLASM?             (Neoplasm(s), etc.)
 14      2185     MUTAGEN?              (Mutagen(s), Mutagenic, Mutagenesis, etc.)
 15      2750     MUTAT?                (Mutate(s), Mutation(s), Mutating, etc.)
 16      1127     TERATOG?              (Teratogens, Teratogenic, Teratogenesis, etc.)
 17       180     TERATOL?              (Teratology, Teratological, etc.)
 18      1431     SERIAL# CVI           (Mortality words stored strategy)
 19     27660     9-18/OR
 20        10     2 AND 19              10 Relevant   100% Precision
 21         0     20 NOT 8

Section 3
 22     24434     SC=CA059              (Air Pollution and Industrial Hygiene)
 23      5387     SC=CA003005           (Biochemical Interactions--Mammalian Systems)
 24       835     SC=CA003006           (  --Human Systems)
 25       237     SC=CA005005           (Agrochemical--Mammal (rodenticides, etc.))
 26      9609     SC=CA013002           (Mammalian Biochemistry--Metabolism)
 27      1336     SC=CA013004           (  --Genetics)
 28     13900     SC=CA013013           (  --Other (gen. physiol. chem. stud.))
 29     55722     22-28/OR
 30        31     2 AND 29              5 Relevant   16% Precision
 31        24     30 NOT 8              1 Relevant    4% Precision

Section 4
 32      3694     SC=CA014003           (Mammalian Pathological Biochem.--
                                         Metabol. & Hered. Diseases)
 33      3673     SC=CA014004           (  --Organ & Gland. Diseases)
 34       762     SC=CA014005           (  --Digest. & Excret. Diseases)
 35       618     SC=CA014006           (  --Reprod. Dis. & Preg.)
 36      2062     SC=CA014007           (  --Circul. & Resp. Diseases)
 37      1077     SC=CA014008           (  --Nervous & Sens. Diseases)
 38       905     SC=CA014009           (  --Blood Dyscrasias)
 39      4777     SC=CA014010           (  --Cancer (neoplasia))
 40      1663     SC=CA014013           (  --Other)
 41     19236     32-40/OR
 42         0     2 AND 41

*Run as more than one search (two searches).
strategy, further tests on teratogenesis, mutagenesis, and mortality words
need to be undertaken with other chemicals/pollutants.

Section 3 includes a variety of potentially useful subject codes, from
air pollution and industrial hygiene of the Applied Chemistry and Chemical
Engineering Sections to biochemical interactions, mammalian biochemistry,
and agrochemicals of the Biochemistry Sections. This produced a fair
amount of citations but precision was a low 16%. Of the citations not
included in Section 1, only one was relevant, giving a 4% precision value.

Section 4 contains specific subsections of CA014, the section on
mammalian pathological biochemistry. It produced no citations on asbestos.

A more indepth analysis of Section 1 appears in Tables 5 and 6.

Table 5. Numerical breakdown of search term appearance in Section 1

Section 1              Total  SC=     Alone  TOXIC?  Alone  HEALTH       Alone  HYGIEN?  Alone
                       Cits.  CA004                         /TI,DE
Relevant Citations      34     30      22     10       0    [illegible]    2       0       0
Nonrelevant Citations   12      8       6      3       1       3           3       0       0

Table 6. Numerical breakdown of search term co-appearance in Section 1

Section 1              Total   SC=CA004 &    TOXIC? &        SC=CA004, TOXIC?, &
                       Cits.   TOXIC?        HEALTH/TI,DE    HEALTH/TI,DE
                               Together      Together        Together
Relevant Citations      34        7          [illegible]          4
Nonrelevant Citations   12        2              0                0

None of the citations contained the free-text term HYGIEN?. SC=CA004 was
by far the most powerful in retrieving citations. TOXIC? retrieved no
relevant citations alone, but HEALTH/TI,DE did.

Since Sections 2 and 4 added no additional citations to the Section
1 strategy, Section 3 remains for consideration. It added only one
relevant citation (set 31) and had a low precision of only 16%. The only
subject code in Section 3 retrieving citations was CA059. A breakdown of
retrieval by its subsections is in Table 7. Constructing a strategy
using search terms that retrieved relevant citations produces the following
profile:

    1 SC=CA004
    2 HEALTH/TI,DE
    3 SC=CA059002
    4 1-3/OR

The results would be as follows:

    46   Citations from Section 1
    13   Citations using SC=CA059002
    59   Total Citations

    34   Relevant Citations from Section 1
     1   Relevant Citation using SC=CA059002
    35   Total Relevant Citations

    35/59 x 100 = 59% Precision

Table 7. Numerical breakdown of CA059 subsection appearance in Section 3

[Table body illegible. Columns: Total Cits., then each CA059 subsection
through 005, each followed by an "Alone" column. Rows: Relevant Citations
(total 1) and Nonrelevant Citations (total 23).]

Although CA059 includes toxicity of air pollutants to humans and
other animals and industrial hygiene, particularly subsections 002 (Air
pollutants and pollution) and 003 (Industrial hygiene), its scope is much
broader than this and thus introduces a high level of irrelevance when
used in a health effects search. A searcher might, with clear conscience,
opt to leave it out, using the profile in Section 1. The author opts for
the whole profile in Section 1 rather than the abbreviated "SC=CA004 OR
HEALTH/TI,DE" version of it that would retrieve all relevant citations
for asbestos health effects but the SC=CA059002 indexed one. TOXIC? and
HYGIEN? need to be tested with other chemicals/pollutants, especially with
ones whose effects are more immediate than those of asbestos. The words
portion of the strategy also warrants further investigation.

NTIS (See Figure 6 for strategy and data.)

When dealing with the subject codes preceded by CF=, one digit codes
must be entered both with and without the preceding 0 placeholder, e.g.,
CF=06T? and CF=6T?. This is because these codes have been applied both
ways at varying times. All codes are truncated because "*s" indicating
use as a major descriptor have also been variously applied.
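
The rule above (enter one digit codes with and without the leading zero, and always truncate) is mechanical enough to sketch in a few lines. The helper name `ntis_code_variants` is hypothetical and the function only builds the search terms as text; it is an illustration of the rule, not a DIALOG feature.

```python
def ntis_code_variants(code: str) -> list:
    """Return the truncated CF= search terms for an NTIS subject code.

    Codes whose numeric part is a single digit (e.g. "6T" or "06T") are
    returned both with and without the leading zero; other codes yield
    a single term. The trailing "?" is the truncation symbol.
    """
    digits = code.rstrip("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
    letters = code[len(digits):]
    number = digits.lstrip("0")
    if len(number) == 1:
        return [f"CF=0{number}{letters}?", f"CF={number}{letters}?"]
    return [f"CF={digits}{letters}?"]


print(ntis_code_variants("6T"))
print(ntis_code_variants("57U"))
```

Generating both forms from one input avoids silently missing the citations indexed under the other form.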

Section 1 of the NTIS search contained HEALTH/TI,DE,ID plus the index
codes for Environmental Health; Environmental Biology; Industrial Medicine;
Public health, hygiene, and industrial medicine; and Toxicology. Only 20
citations, a fair amount, were retrieved as the data base is smaller than
BIOSIS and Chemcon (set 17). However, its precision of 65% is good for
this data base since NTIS tends to have a high level of irrelevancy because
of indexing and abstracting practices.

Section 2 contained the mortality words strategy CVI (Figure 3) which
produced no citations about asbestos (set 19).

Section 3 contained index codes for pathology, genetics, physiology,
and chemical and biological warfare which also produced no citations about
asbestos (set 27).

Section 4 contains free-text words on carcinogenesis, mutagenesis,
and teratogenesis. This section had low retrieval and mediocre precision
of 50% (set 38). It added no new relevant citations to the Section 1
strategy.

Figure 6.

NTIS SEARCH STRATEGY

Set   Citations    Search statement   Description

  1        340     SERIAL# F87        (Asbestos stored strategy)
  2   [illegible]  1/[illegible]      )
  3   [illegible]  1/[illegible]      ) (1976 accession numbers)
  4   [illegible]  1/[illegible]      )
  5         55     2-4/OR

Section 1
  6        882     CF=68G?            (Environmental Health)
  7       1020     CF=06F?            (Biol. and Med. Sciences--Environmental Biology)
  8       3689     CF=6F?             (  --Environmental Biology)
  9        315     CF=06J?            (  --Industrial (occup.) medicine)
 10        756     CF=6J?             (  --Industrial (occup.) medicine)
 11        619     CF=06T?            (  --Toxicology)
 12       2433     CF=6T?             (  --Toxicology)
 13       1606     CF=57U?            (  --Public health, hygiene, & ind. med.)
 14       2000     CF=57Y?            (  --Toxicology)
 15      11503     HEALTH/TI,DE,ID
 16      18661     6-15/OR
 17         20     5 AND 16           13 Relevant   65% Precision

Section 2
 18       2709     SERIAL# CVI        (Mortality words stored strategy)
 19          0     5 AND 18

Section 3
 20        840     CF=[illegible]     (  --Pathology)
 21   [illegible]  CF=[illegible]     (  --Cytology, genetics, & molec. biol.)
 22        294     CF=06P?            (  --Physiology)
 23       5651     CF=6P?             (  --Physiology)
 24       1698     CF=57S?            (  --Physiology)
 25        301     CF=74D?            (Military Sciences--Chemical, biological,
                                       and radiol. warfare)
 26       8367     20-25/OR
 27          0     5 AND 26

Section 4
 28        506     CARCINOGEN?        (Carcinogen(s), Carcinogenic, Carcinogenesis, etc.)
 29       1261     CANCER?            (Cancer(s), Cancerous, etc.)
 30        347     TUMOR?             (Tumor(s), etc.)
 31        112     CARCINOMA?         (Carcinoma(s), etc.)
 32       1560     NEOPLASM?          (Neoplasm(s), etc.)
 33        316     MUTAGEN?           (Mutagen(s), Mutagenic, Mutagenesis, etc.)
 34        807     MUTAT?             (Mutate(d), Mutating, Mutation(s), etc.)
 35         25     TERATOG?           (Teratogen(s), Teratogenic, Teratogenesis, etc.)
 36        122     TERATOL?           (Teratology, Teratological, etc.)
 37   [illegible]  28-36/OR
 38          4     5 AND 37           2 Relevant   50% Precision
 39          1     38 NOT 17          0 Relevant    0% Precision

By analyzing Section 1 in greater depth, two breakdowns of code
appearance are possible and are shown in Tables 8 and 9. For health
effects of asbestos, only two search terms would have been needed,
HEALTH/TI,DE,ID and CF=6T? or 06T?. The precision level also would have
increased to 72% since two nonrelevant citations (numbers 2 and [illegible]
in Table 9) would not have been retrieved.

Table 8. Breakdown of search term appearance in Section 1

[Table body illegible. Columns: Total Cits., then CF=68G?, CF=06F? or 6F?,
CF=06J? or 6J?, CF=06T? or 6T?, CF=57U?, CF=57Y?, and HEALTH/TI,DE,ID,
each followed by an "Alone" column. Rows: Relevant Citations and
Nonrelevant Citations.]

Table 9. Use of Section 1 search terms by relevant and nonrelevant citations

[Table body illegible. Columns: Cit. No., CF=68G?, CF=06F? or 6F?,
CF=06J? or 6J?, CF=06T? or 6T?, CF=57U?, CF=57Y?, HEALTH/TI,DE,ID.
Rows: one per relevant citation, then one per nonrelevant citation, with
an X marking each search term present in the citation.]

However, it would be premature to generalize so spartan a strategy
from the results for asbestos. Notice, for example in Tables 8 and 9,
that all but CF=06F?/CF=6F? appeared in the relevant citations. Based on
this core strategy, the author believes that it should remain as is but
be tested further for verification, especially CF=06F?/CF=6F?. TOXIC? was
not used in this test and might be worth testing against the toxicology
codes. As with BIOSIS and Chemcon, the words and/or codes on
carcinogenesis, mutagenesis, teratogenesis, and mortality need further testing by
using other chemicals/pollutants.

Enviroline

By selecting all the citations on the asbestos terms for the volumes
corresponding to 1975 through 1976 (see Figure 7), it was possible to
work backwards to arrive at a strategy. First, relevant citations were
determined [...] the citations. The circled terms in nonrelevant citations
were later used to calculate total citations that would be retrieved from a
strategy and its precision.

Figure 7. Asbestos search strategy for years 1975 through 1976.

    1   235   SERIAL# F87 (Asbestos stored strategy)
    2    97   1/800000-1199999 (1975-1976 accession numbers)

    36 Relevant   37% Precision

The number of relevant articles containing each term was counted.
HEALTH, HAZARD?, and HYGIEN? were further subdivided by whether they
appeared in the title and/or descriptor of the citations or in any other
location of the citation. See Table 10.
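
Counting how many citations contain a candidate term depends on how truncation is read: a term ending in "?" stands for every word beginning with that stem. The sketch below shows one way to express that count. The function names `matches` and `count_citations` are assumptions, and the three word lists are invented stand-ins for citations; the study's counts were made by hand from printed records.

```python
def matches(term: str, word: str) -> bool:
    """True if `word` satisfies a search term.

    A trailing "?" means open-ended truncation, so "TOXIC?" matches
    TOXIC, TOXICITY, TOXICOLOGY, and so on. Comparison ignores case.
    """
    term, word = term.upper(), word.upper()
    if term.endswith("?"):
        return word.startswith(term[:-1])
    return word == term


def count_citations(term, citations):
    """Number of citations containing at least one word matching `term`."""
    return sum(any(matches(term, w) for w in words) for words in citations)


# Made-up word lists standing in for three citations.
citations = [
    ["asbestos", "carcinogenic", "exposure"],
    ["asbestos", "toxicity", "health"],
    ["asbestos", "mining", "hazards"],
]
for term in ("CARCINO?", "TOXIC?", "HAZARD?", "HEALTH"):
    print(term, count_citations(term, citations))
```

Each citation is counted once per term regardless of how many of its words match, which is the sense in which the table reports "No. of citations".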



Table 10. Breakdown of tentative search term appearance in relevant
citations

Tentative Search Term                                 No. of citations

RC=02 (Chemical and Biological Contamination)               31
HEALTH/TI,DE                                                25
HEALTH not in TI,DE                                         15
HAZARD?/TI,DE                                                2
HAZARD? not in TI,DE                                    [illegible]
TOXIC?                                                       2
CARCINO?                                                    21
CANCER?                                                     21
PATHOL?                                                     13
DISEASE?                                                     8
DISORDERS                                                    5
EXPOSURE?                                                   15
ADVERSE                                                      2
HYGIEN? in TI and/or DE                                      2
HYGIEN? not in TI,DE                                         3
BIOLOGICAL                                                   5
EPIDEMI?                                                     1
MORTAL?                                                      1
DEATH                                                        3
FATAL?                                                       1
NEOPLASM?                                               [illegible]
MALIGNAN?                                                    1



A tentative group of words was selected and evaluated as in Table
11. RC=02 is not included in this table.

Table 11. Appearance of tentative search terms in relevant and
nonrelevant citations

Terms          No. of       No. of        No. of        No. of
               Relevant     Appearances   Nonrelevant   Appearances
               Citations    Alone         Citations     Alone

HEALTH/TI,DE      25           0             17         [illegible]
HEALTH/AB         13           1             12         [illegible]
HAZARD?            7           1              5         [illegible]
CARCINO?          21           0             16         [illegible]
CANCER?           21           0             11         [illegible]
PATHOL?           13           0         [illegible]        1
EXPOSURE          15           0              8             0
DEATH              3           0              0             0
TOXIC?             2           0              3         [illegible]

TOTAL CITS.       36       [illegible]   [illegible]    [illegible]
Since only a very small number of the search words that were considered
important, either because of actual numbers of citations retrieved or
relationship to health effects terminology (e.g., TOXIC?), appeared as
the only tentative search term in a citation, each citation was checked
for each of the terms in Table 11 plus RC=02. See Table 12.

Table 12. Appearance of tentative search terms in relevant citations

[Table body illegible. Columns: Cit. No., HEALTH/TI,DE,AB, HAZARD?,
CARCINO?, CANCER?, PATHOL?, EXPOSURE, DEATH, TOXIC?, RC=02. Rows: one per
relevant citation, with an X marking each search term present in the
citation.]
Table 13 shows the eight citations which are not retrieved by
HEALTH/TI,DE,AB. By using the health effects strategy in Figure 8
with F87 (see Figure 1), all 36 relevant citations would be retrieved.

Table 13. Appearance of search terms in the eight citations not retrieved
by HEALTH/TI,DE,AB

Cit. No.  HAZARD?  CARCINO?  CANCER?  PATHOL?  EXPOSURE  DEATH  TOXIC?  RC=02

Figure 8. Enviroline health effects strategy number 1.

    1 HEALTH/TI,DE,AB
    2 HAZARD?
    3 PATHOL?
    4 CARCINO?
    5 1-4/OR

- * 
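The reasoning behind adding terms to a core strategy can be sketched in a
few lines of Python. This is only an illustration under stated
assumptions: the citation numbers and term assignments in
`citation_terms` are hypothetical and are not taken from Tables 12 or 13,
and `retrieved` is a made-up helper that treats a strategy as a simple OR
of its terms.

```python
# Hypothetical incidence data: citation number -> tentative terms it contains.
citation_terms = {
    1: {"HEALTH", "RC=02"},
    2: {"HAZARD?", "RC=02"},
    3: {"HEALTH", "CARCINO?"},
    4: {"CARCINO?", "PATHOL?"},
}


def retrieved(strategy):
    """Citations containing at least one term of the OR-ed strategy."""
    return {cit for cit, terms in citation_terms.items() if terms & strategy}


core = {"HEALTH"}                                      # essential strategy
extended = core | {"HAZARD?", "PATHOL?", "CARCINO?"}   # core plus added terms

all_citations = set(citation_terms)
missed_by_core = all_citations - retrieved(core)
missed_by_extended = all_citations - retrieved(extended)

print("missed by core strategy:", sorted(missed_by_core))
print("missed by extended strategy:", sorted(missed_by_extended))
```

Listing the citations missed by the core term first, and only then looking
for terms that recover them, mirrors the order in which the tables were
compiled.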

Looking at the nonrelevant citations using these search terms and RC=02
in Table 14 shows that 37 nonrelevant citations would be retrieved.
Figure 9 shows the statistics for the resultant search strategy.

Figure 9. Effectiveness of strategy number 1

    36 Relevant Citations
    37 Nonrelevant Citations
    73 Total Citations

    36/73 x 100 = 49% Precision

This precision is an important improvement over just selecting asbestos,
but still modest.

Table 14. Appearance of tentative search terms in nonrelevant citations
Another simpler search approach would be the following:

    1  HEALTH/TI,DE,AB
    2  RC=02
    3  1 OR 2

This would result in

    36 Relevant Citations
    43 Nonrelevant Citations
    79 Total Citations

    36/79 x 100 = 46% Precision
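The precision arithmetic for the two Enviroline strategies can be checked
with a short Python sketch. The citation counts are the ones reported in
the text; the function name `precision` and the strategy labels are
illustrative assumptions only.

```python
def precision(relevant: int, nonrelevant: int) -> float:
    """Percentage of retrieved citations that are relevant."""
    total = relevant + nonrelevant
    if total == 0:
        raise ValueError("no citations retrieved")
    return 100.0 * relevant / total


# (relevant, nonrelevant) counts reported for the asbestos test search.
strategies = {
    "strategy number 1 (four terms OR-ed)": (36, 37),
    "simpler approach (HEALTH/TI,DE,AB OR RC=02)": (36, 43),
}

for name, (rel, nonrel) in strategies.items():
    print(f"{name}: {precision(rel, nonrel):.1f}% precision")
```

Because both strategies retrieve all 36 relevant citations, the comparison
turns entirely on how many nonrelevant citations each one adds.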



While this lowers the precision, it has the advantage that RC=02 (Chemi-
cal and Biological Contamination) includes the numerous health-related
Enviroline keywords within it, e.g., carcinogenic agents; health, env.;
pathology, human. This still needs to be tested with other chemicals/
pollutants, but the approach looks favorable. This level of precision
is certainly acceptable for this data base because of the way in which it
is indexed.

It would also be wise to test the longer strategy further, especially
in the word areas of toxicology, mortality, mutagenesis, teratogenesis,
and carcinogenesis.

Pollution Abstracts

The approach for developing the health effects strategy in this data
base was very similar to that for Enviroline. Pollution Abstracts is
even more specific in its indexing than Enviroline and has many fewer
indexing words from which to choose. Also, Pollution Abstracts has no
indexing codes.

All citations for the asbestos serial F87 were selected and then
manually limited to the years 1972 - 1976. This could also be done in the
following way:

    1  SERIAL# F87
    2  YR=76
    3  YR=75
    4  YR=74
    5  YR=73
    6  YR=72
    7  2-6/OR
    8  1 AND 7

The result was 95 citations, 48 of which were relevant, with a precision
of about 51% (48/95 x 100).
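The set logic behind the year-limiting steps can be sketched in Python.
This is a toy model, not DIALOG itself: the `records` list is hypothetical
data, and `select` is an assumed helper that stands in for a search
statement returning a set of citation numbers.

```python
# Hypothetical toy records: (citation id, year, retrieved by asbestos serial?)
records = [
    (1, 1971, True),
    (2, 1972, True),
    (3, 1974, False),
    (4, 1976, True),
    (5, 1977, True),
]


def select(pred):
    """Return the set of citation ids whose record satisfies pred."""
    return {cid for cid, year, in_serial in records if pred(year, in_serial)}


# Set 1: the asbestos serial.
s1 = select(lambda year, in_serial: in_serial)

# Sets 2-6: one set per publication year, 1972 through 1976.
year_sets = [
    select(lambda year, in_serial, wanted=wanted: year == wanted)
    for wanted in range(1972, 1977)
]

# Set 7: 2-6/OR, the union of the year sets.
s7 = set().union(*year_sets)

# Set 8: 1 AND 7, the serial limited to those years.
s8 = s1 & s7

print(sorted(s8))
```

OR-ing the year sets before AND-ing them with the serial keeps the
limiting step separate from the subject search, which is what the
eight-step strategy does.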

After the relevant citations were determined, likely candidates for
search terms were circled in all citations. The number of relevant
articles containing each term was counted, and whether or not this candi-
date was the only candidate search term appearing in the citation was
noted. Table 15 shows this approximate count. Then ten terms were selected

Table 15. Appearance of tentative search terms in relevant citations

Tentative               Number of     Number of Citations
Search                  Relevant      Where Search Term
Term                    Citations     Appeared Alone

HEALTH/TI,DE            21            2
HEALTH not in TI,DE     12            1
HYGIEN?                 [illegible]   2
HAZARD?                               1
PATHOL?                               3
DISEASE?                9             2
TOXIC?                  [illegible]   1
CYTOTOXIC?              [illegible]   2
CARCINO?                12            1
CANCER?                 [illegible]   [illegible]
MUTAG?                  1
TERATO?                 1
EPIDEMI?
MORTAL?                               [illegible]
AUTOP?                  1             1
BIOLOGIC?                             [illegible]
No terms                [illegible]   4



to be checked for appearances in relevant citations on the basis of their
being the only candidate search term used in at least one relevant cita-
tion. Table 16 gives this list.
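The counting procedure described above (how many relevant citations
contain each candidate term, and in how many it is the only candidate) can
be sketched in Python. The citations and the term list here are
hypothetical, the helper names `matches` and `tally` are assumptions, and
a trailing "?" is modelled as simple prefix matching on the word stem.

```python
from collections import Counter


def matches(term, words):
    """True if term matches any word; a trailing '?' means truncation."""
    if term.endswith("?"):
        stem = term[:-1]
        return any(word.startswith(stem) for word in words)
    return term in words


def tally(citations, terms):
    """Count citations containing each term, and those where it is alone."""
    appears, alone = Counter(), Counter()
    for words in citations:
        hits = [term for term in terms if matches(term, words)]
        appears.update(hits)
        if len(hits) == 1:
            alone[hits[0]] += 1
    return appears, alone


# Hypothetical relevant citations, reduced to upper-case word lists.
citations = [
    ["ASBESTOS", "HEALTH", "HAZARDS"],
    ["ASBESTOS", "CARCINOGENESIS"],
    ["ASBESTOS", "PATHOLOGY", "HEALTH"],
]
terms = ["HEALTH", "HAZARD?", "CARCINO?", "PATHOL?"]

appears, alone = tally(citations, terms)
for term in terms:
    print(term, appears[term], alone[term])
```

A term with a nonzero "alone" count is one whose removal would lose at
least one relevant citation, which is the selection criterion used for the
ten terms.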
Table 16. Appearance of tentative search terms in relevant citations

Cit.  HEALTH   HEALTH not
No.   /TI,DE   in TI,DE
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X 






X 
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X 






^ -■ 


« 




22 






V 




X 












2? 


X 




• 


1 


' 4- 












24. 


X. 






X 








• 












•• 




X 












26 


X 


X 




- 


- X . 












27 


X 


X ' 






X 


. X 


















» X 






X 










29 














-X. 








6 






X . 








V 
















X 




X . 


















X 




X • 






X 












- 




- 












' X ; 


















X 






















X 






27' 
















X 










■ 








X 












X 








< 


' X 












X - 




X 






X ■ 










41 
























X, 
























































X' 






















X 




• 


\ 


46 






X 


t 














47,.. 


X 














X. 






48 1 





















.41 



Note that after selecting the 10 terms for the tentative strategy, the
frequency of some of the terms being the unique search term increased.

These ten terms would retrieve all but two of the relevant cita-
tions. In this case, the citations lost are some of the less specifi-
cally health effects citations. One is on the fringe and concerns en-
zymes, asbestos, and detergents; the other concerns the amount of
chrysotile asbestos in lungs of New York City residents, or bioaccumula-
tion. The latter falls into the area of a specific syndrome, i.e.,
effects on lungs, that would be picked up in a search specifically inte-
rested in the pulmonary health effects of asbestos. But the purpose of
this study is more generally oriented, as stated earlier, and thus the
general strategy of the ten search terms fills the health effects needs.

Checking for these ten search terms in the nonrelevant citations
produced the results in Table 17.

Table 17. Appearance of tentative search terms in nonrelevant citations

Cit.  HEALTH   HEALTH not
No.   /TI,DE   in TI,DE
The strategy would be

    1  HEALTH
    2  HYGIEN?
    3  CARCINO?
    4  HAZARD?
    5  EPIDEMI?
    6  DISEASE?
    7  PATHOL?
    8  CYTOTOXIC?
    9  AUTOP?
    10 1-9/OR

Normally in searching health effects, HEALTH is limited to TI and DE and
sometimes also ID and AB. However, in Pollution Abstracts, HEALTH appears
as the only search term outside these boundaries and, thus, must not be
so limited. Consolidating the two "HEALTH"s shortens the strategy to nine
terms.

The result of this strategy would be as follows:

    46 Relevant Citations
    17 Nonrelevant Citations
    63 Total Citations

    46/63 x 100 = 73% Precision
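Recall and precision for this strategy can be worked out together from the
counts given in the text: 48 relevant citations in the test set, of which
46 are retrieved, along with 17 nonrelevant ones. The Python sketch below
is illustrative only; the function name `recall_and_precision` is an
assumption.

```python
def recall_and_precision(relevant_retrieved, nonrelevant_retrieved,
                         relevant_total):
    """Return (recall, precision) as percentages."""
    retrieved = relevant_retrieved + nonrelevant_retrieved
    if retrieved == 0 or relevant_total == 0:
        raise ValueError("counts must be nonzero")
    recall = 100.0 * relevant_retrieved / relevant_total
    precision = 100.0 * relevant_retrieved / retrieved
    return recall, precision


# Counts reported for the Pollution Abstracts asbestos test search.
recall, precision = recall_and_precision(
    relevant_retrieved=46,
    nonrelevant_retrieved=17,
    relevant_total=48,
)
print(f"recall {recall:.1f}%, precision {precision:.1f}%")
```

Reporting the two figures side by side makes the trade-off explicit: the
nine-term strategy gives up two relevant citations in exchange for a much
smaller retrieved set than the unrestricted asbestos search.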
Several terms need more testing. These are the toxicity, mutagene-
sis, teratogenesis, and mortality words. Since asbestos is not a sub-
stance which rapidly produces toxic effects, it does not adequately test
this concept or that of mortality.

CONCLUSION

In running these tests again, the author suggests truncating
AMPHIBOLE to AMPHIBOLE? in the asbestos serial to ensure picking up cita-
tions under the group name amphiboles in the rare event when asbestos
might not appear in the citation or abstract. Also, the term MALIGNAN?
should be added to the portion of strategies testing cancer words. Its
absence in this case is not critical. After developing the strategies
for Enviroline and Pollution Abstracts, it appears that the terms EXPOS?
and HAZARD? should also have been tested in Chemcon and NTIS.

As stressed throughout this study, the strategies and results are
tentative. All five data bases need testing in several areas, including
toxicology, mutagenesis, carcinogenesis, teratogenesis, and mortality
words. Different substances which produce such effects need to be tested
to see how useful these subdivisions' searches are. By ascertaining
how useful these elements are in retrieving citations for sub-
stances known to produce these effects, a sort of checklist for health
effects of substances can be produced. For example, suppose certain codes
in a given data base have been verified as retrieving, at an acceptable
level of precision, relevant citations on these subdivisions of health
effects when the free-text words in this area do not. If these same codes
are then used in a search on a particular health effect, such as
teratogenesis, and no relevant citations are produced, the probability is
high that no documents/articles on this effect have been entered into the
data base.

Although this is a preliminary study, definite trends still can be
ascertained for these data bases by testing with asbestos, and are repre-
sented in the results for each data base. Codes work well in BIOSIS;
Chemcon and NTIS require a combination of codes and words; on Enviroline
either words alone or words and codes together can be used; and Pollution
Abstracts, of course, requires all words, whose identities can be pin-
pointed as was done in this study.
APPENDIX A. LIST OF AIDS IN PROFILE DEVELOPMENT

*Biosciences Information Service of Biological Abstracts. BIOSIS Search
    Guide: BIOSIS Previews Edition. Philadelphia, Pa.: BIOSIS, 1977.

________. CROSS Code. Philadelphia, Pa.: BIOSIS, n.d.

________. A Guide to the Vocabulary of Biological Literature. Philadelphia,
    Pa.: BIOSIS, 1973.

________. Profile Guide. Philadelphia, Pa.: BIOSIS, n.d.

________. Subject Guide to the CROSS Index. Philadelphia, Pa.: BIOSIS, n.d.

Chemical Abstracts Service. Subject Coverage and Arrangement of Abstracts
    by Sections in Chemical Abstracts. 1975 edition. Columbus, Ohio:
    American Chemical Society, 1974.

Environment Information Center, Inc. Enviroline User's Manual. New York:
    Environment Information Center, n.d.

Lockheed Information Systems. Brief Guide to DIALOG Searching. Palo
    Alto, Ca.: Lockheed Information Services, 1976.

Lockheed Retrieval Services Information Systems Laboratory. DIALOG
    Terminal Users Reference Manual. 2 vols. Palo Alto, Ca.: Lock-
    heed Missiles and Space Co., n.d.

Lockheed Information Systems. Online information on data bases and
    limiting. March 13, 1977.

Stamm, Roy P., and Ryerson, Ted. NTIS Subject Classification (Past and
    Present). Springfield, Va.: National Technical Information Ser-
    vice, Nov. 1975. NTIS/SR-75101.

Strategy cards prepared by EPA-RTP library's searchers.

Using CA Condensates and CASIA. Presented at the DIALOG User's Workshop,
    Chicago, Ill., July 16-17, 1976.

Pollution Abstracts. Pollution Abstracts Keyword Master List. Louisville,
    Ky.: Data Courier, 1976.

*Received at EPA-RTP library after search strategies were developed
and run.